Exploratory Data Analysis¶
Updates¶
- On 04 August 2023, I ran into a problem with a Python library called nbconvert, which converts Jupyter notebooks into HTML, PDF, etc. I upgraded the library from version 7.2.6 to version 7.7.2 and the issue with plots not displaying has been resolved. (Note to self: also learn to use PyPlot later on.)
- As of the close of the business day on 04 February 2022, the US Treasury Department changed the URLs for the XML data feeds, the XML schemas (the XSD files) and the XML files themselves. My analysis is updated below to reflect these changes.
Change Log¶
| Date | Revision |
|---|---|
| 08/04/2023 | Fixed plot-rendering issue by upgrading nbconvert. |
| 08/01/2023 | Proofreading and adding new details. |
| 08/08/2022 | Origination |
Exploratory Analysis¶
I began my exploratory analysis back in $2018$, in search of inflation in the yield of debt securities. According to the macroeconomist Ray Dalio, the United States is entering the third and final stage of the long debt cycle that began after the Second World War (1945). John von Neumann died of cancer attributed to radiation from some of the experiments he ran with nuclear weapons. Nuclear weapons are extremely dangerous to humans and to the continuity of life on Planet Earth, which is why a Third World War must be avoided at all costs.
The stages Ray Dalio highlights in his "Principles for Navigating Big Debt Crises" advance as follows:
- In stage one, which Mr. Dalio terms MP1 or "Monetary Policy One (1)," interest rates do the work naturally, without affecting central bank balance sheets.
- In stage two, or "Monetary Policy Two (2) [MP2]," interest would have accrued to the point of overwhelming principal on open balances. The higher the principal on account balances, the higher the accrued debts of individuals and businesses, which eventually led to many bankruptcies and defaults. At the same time, government deficit would have grown to high levels, and this is why quantitative easing (QE) was needed, to stimulate demand for government bonds. I claim that even though QE stimulates demand through the increase in money supply in the short-term, the problem with QE is that it not only increased the monetary base, but it also increased debt owed back with interest to government loan originators or trustees of federal debt, while also placing a burden on the balance sheet of central banks. Quantitative Easing led to the subprime loan crisis in 2007 to 2009 (I learned from somebody I know personally who lives in Spain that when people were losing their homes in Spain, they had to keep paying the debt for their mortgage bankruptcies; it was that bad). H. R. 2811 passed Congress by a slim majority of votes in the House of Representatives, and thank God it did, because the expansion of debt has its limitations; the burden is already too much for the average American.
- In stage three, or "Monetary Policy Three (3) [MP3]," a classic model is applied, based on a classic principle, as old as the Athenians in ancient Greece; this model is based on wealth distribution. Once again questions regarding "to each according to his needs" versus "to each according to his abilities or merit" are once again examined. It is a known fact that QE was not the best option. The good news is that many U.S. businesses have expanded through loan borrowing, but corporations should never expand to the point of overwhelming the critical functions of the Federal government, which is the only body capable of sponsoring disability benefits, employments benefits, retirement benefits, public education, energy and so many other functions, that albeit limited in their respective roles, are essential for society. Without government, society ceases to exist and the experiment of "democracy" ends. The deficit is severe enough to impact the lives of all U.S. Citizens in the long-term. The "Wealth Transfer" Mr. Dalio mentioned is something I have heard before, in a book by Sonyia L. Thompson called, "Seeds of Prosperity for a Financial Revolution." The Great Reverend and Apostle, Dr. Sonyia L. Thompson, obtained her PhD in biblical finance and is a minister of the Gospel, who carries the mission to "equip and edify the body of Christ." Halleluyah!
Debt is not equity, although the sum of both, across all persons, individuals and corporations, is a nation's total assets. One such asset is the population of able-bodied adults without dependents (ABAWD) in the United States. The analysis enclosed herein is a compendium to a larger work I endeavor to finish one day, "A Dissertation on Credit Risk: Viewing Operating Cost of Employees and Other Assets as Utility." Can employees one day be considered a utility, for which a mandatory budget must always be set aside to meet payroll obligations? Viewing employees as a liability is contrary to human nature, because we are all mammals, and mammals are social animals that share social communion.
Viewing American employees as anything but a liability to be extracted from revenue for reporting in financial statements requires the willful exchange ("bequeathal") of emotions, ideas and thoughts, the fundamental pillars of our existence, without which we would all be reduced to non-being, or nothingness, as Jean-Paul Sartre carefully describes in his book "Being and Nothingness." According to the Conceptual Framework reviewed periodically by the Financial Accounting Standards Board (FASB), without realization there is no recognition, but I claim that without goodwill, there is no utilitarian realization.
Without goodwill, any good prospect to a company can be laid off indefinitely, without ever again meeting a prospective employer who can help him or her pull out of hardship. Lack of authenticity and goodwill is a malady that advanced society suffers from: it leads to declines in wellbeing, innovation and employee morale. On 16 July 2020, the FASB released for public comment an exposure draft of a new chapter of its Conceptual Framework, defining 10 elements of financial statements: assets, liabilities, equity or net assets, revenues, expenses, gains, losses, investments by owners, distributions to owners and comprehensive income. Once finalized, the new chapter was said to replace "Concepts Statement 6," before being added as a new Accounting Standards Update (ASU).
When it comes to innovation, especially, an employee would like to have his or her fair share, based on the value of any invention said employee may have come up with. Invention valuation is an interesting grey area I would like to explore with other stakeholders, to hopefully come up with a standard for fair value appraisal, which benefits the utility valuation of an employee.
I will now continue my analysis of U.S. bonds for 2021.
Categories of Bonds¶
| $\text{Type}$ | $\text{Maturity}$ |
|---|---|
| $\text{Bill (b)}$ | $1\text{m} \leq \text{T} \leq 1\text{Y}$ |
| $\text{Note (N)}$ | $1\text{Y} \lt \text{T} \leq 10\text{Y}$ |
| $\text{Bond (B)}$ | $\text{T} \gt 10\text{Y}$ |
What is a Constant Maturity Treasury?¶
According to Investopedia, constant maturity is an adjustment for equivalent maturity, used by the Federal Reserve Board to compute an index based on the average yield of various Treasury securities maturing at different periods. One would use a constant maturity yield as a reference for pricing various kinds of debt or other fixed income securities (recall that rates need to be locked by an underwriter). Fixed income securities make up one of the largest segments of the U.S. securities markets.
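As a toy illustration of the constant-maturity idea (the published rates use spline interpolation, described later), a yield for a maturity that is not quoted can be interpolated from adjacent quoted maturities; the numbers below are the 3Y and 5Y values from the first row of the 2021 nominal table further down.
m, y = [3.0, 5.0], [0.16, 0.36]                            # 3Y and 5Y yields on 2021-01-04
y4 = y[1] + (4.0 - m[1]) / (m[2] - m[1]) * (y[2] - y[1])   # linear interpolation to T = 4
# y4 == 0.26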
Nominal Yield Curve Rates vs. Real Yield Curve Rates¶
Below, $d$ denotes a discount factor, and the assumption is that the no-arbitrage principle applies.
According to the Board of Governors of the Federal Reserve System, a nominal Treasury security is an issuance that specifies principal and interest as fixed dollar amounts to the holder. Since Treasury securities are backed by the full faith and credit of the U.S. government, the returns investors can earn on them are often used as a “risk-free” benchmark in finance research and investment practice. These securities are a promise to repay the principal (with interest, in the case of a note or a bond) to the holder (which can be a public or private entity). The real yield curve rate, by contrast, is adjusted for inflation, and can thereby be used to determine the risk-neutral cashflow of "risky" assets.
A risk-neutral, "derivative" measure would be computed using $H_T$, where $H_T$ is a random variable, to then determine fair value
$H_0 = d(0,T)E_QH_T \tag{1}$
where $E_Q$ denotes expectation under a risk-neutral (martingale) measure $Q$ that solves the equation for $H_0$.
The nominal yield curve rate adds a market-average expected inflation risk premium to the real yield curve rate; investors must be compensated for the risk of inflation in bond yields. The real yield determines the cashflow an investor in secondary markets can obtain by purchasing $\text{TIPS}$ (Treasury Inflation-Protected Securities), offered by the U.S. Treasury Department, or similar inflation-protected products from other issuers, which brings me to the following.
Suppose there were a measure $P$, equivalent to $Q$, under which (1) can be rewritten. Changing measure inside the expectation gives
$H_0 = d(0, T)\, E_Q\left( H_T \right) = d(0, T)\, E_P\left( \frac{dQ}{dP} H_T \right),$
where $\frac{dQ}{dP}$ is the Radon-Nikodym derivative of $Q$ with respect to $P$, and the reweighted process is still a martingale. There are many examples where this is used in practice, such as Brownian motion models and the binomial model of asset pricing. A good reference is "Fundamental Theorem of Asset Pricing" by Glyn A. Holton, which I managed to retrieve from the Internet before the website was taken down for some reason.
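To make the change of measure concrete, here is a minimal one-period binomial sketch; every parameter (the up and down factors, the rate and the strike) is an illustrative assumption, not drawn from the data below.
S0, u, d, r = 100.0, 1.10, 0.90, 0.05                 # illustrative one-period binomial market
q = ((1 + r) - d) / (u - d)                           # risk-neutral up-probability, q = 0.75
H_T = (max(S0 * u - 100, 0), max(S0 * d - 100, 0))    # payoff of a call struck at 100
H0_Q = (1 / (1 + r)) * (q * H_T[1] + (1 - q) * H_T[2])   # (1): discounted Q-expectation
p = 0.6                                               # an arbitrary "real-world" up-probability
dQdP = (q / p, (1 - q) / (1 - p))                     # Radon-Nikodym weights, state by state
H0_P = (1 / (1 + r)) * (p * dQdP[1] * H_T[1] + (1 - p) * dQdP[2] * H_T[2])
H0_Q ≈ H0_P                                           # true: the fair value is measure-invariant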
Statement of Purpose¶
The purpose of this exploratory analysis is to draw preliminary conclusions regarding inflation in U.S. securities. Stability testing includes anomaly testing, a test for stationarity and normality testing. US Treasury Bonds, Notes and Bills are used for vanilla testing, as these are market averages. In a separate notebook, an inflation simulation is run for any security where inflation is detected within the confidence region of an Augmented Dickey-Fuller test. Can a Radon-Nikodym derivative be set up to model the change in measure of a martingale for inflation rates (nominal interest rates minus real interest rates) across time?
Importing Libraries¶
include("./Treas.jl");
Loading and Visualizing Datasets¶
I start by visualizing two types of CMT rates: the nominal and real CMT yield curve rates for new fixed income securities and debts maturing at different time periods. These are floating reference rates which a loan processor can lock for any borrower of new credit. Inflation in the average nominal yield causes the market, overall, to be leveraged above the real value of the promises made to the holders of securities.
There is another dataset of yield curve rates, the Par Yield Curve Rate, which I will analyze in Treas TS Analysis III. On its information sheet, Treasury indicates that a new methodology, the monotone convex (MC) spline method, is now used instead of the historic quasi-cubic Hermite spline (HS) method, for a better fit and interpolation of data points on the yield curve. Recall that a par yield is the coupon rate at which a bond's price equals its nominal (face) value.
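As a quick sanity check of that definition, here is a toy annual-pay bond (the maturity, coupon and discount rate are made-up numbers): when the coupon rate equals the discount rate, discounting returns exactly the face value.
face, c, y, n = 100.0, 0.04, 0.04, 3                  # illustrative 3-year annual-pay bond
price = sum(face * c / (1 + y)^t for t in 1:n) + face / (1 + y)^n
price ≈ face                                          # true: coupon == yield implies price == par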
Nominal CMT Yield Curve Rates¶
Loading the nominal yield curve rates,
tux = DailyTreasuryYieldCurveRateData("2021", true)
| Row | Id | dt | BC_1MONTH | BC_2MONTH | BC_3MONTH | BC_6MONTH | BC_1YEAR | BC_2YEAR | BC_3YEAR | BC_5YEAR | BC_7YEAR | BC_10YEAR | BC_20YEAR | BC_30YEAR | BC_30YEARDISPLAY |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Int64 | DateTime | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | |
| 1 | 7759 | 2021-01-04T00:00:00 | 0.09 | 0.09 | 0.09 | 0.09 | 0.1 | 0.11 | 0.16 | 0.36 | 0.64 | 0.93 | 1.46 | 1.66 | 1.66 |
| 2 | 7760 | 2021-01-05T00:00:00 | 0.08 | 0.09 | 0.09 | 0.09 | 0.1 | 0.13 | 0.17 | 0.38 | 0.66 | 0.96 | 1.49 | 1.7 | 1.7 |
| 3 | 7761 | 2021-01-06T00:00:00 | 0.09 | 0.09 | 0.09 | 0.09 | 0.11 | 0.14 | 0.2 | 0.43 | 0.74 | 1.04 | 1.6 | 1.81 | 1.81 |
| 4 | 7762 | 2021-01-07T00:00:00 | 0.09 | 0.09 | 0.09 | 0.09 | 0.11 | 0.14 | 0.22 | 0.46 | 0.78 | 1.08 | 1.64 | 1.85 | 1.85 |
| 5 | 7763 | 2021-01-08T00:00:00 | 0.08 | 0.08 | 0.08 | 0.09 | 0.1 | 0.14 | 0.24 | 0.49 | 0.81 | 1.13 | 1.67 | 1.87 | 1.87 |
| 6 | 7764 | 2021-01-11T00:00:00 | 0.09 | 0.08 | 0.08 | 0.1 | 0.1 | 0.14 | 0.22 | 0.5 | 0.84 | 1.15 | 1.68 | 1.88 | 1.88 |
| 7 | 7765 | 2021-01-12T00:00:00 | 0.09 | 0.08 | 0.09 | 0.09 | 0.11 | 0.14 | 0.23 | 0.5 | 0.83 | 1.15 | 1.68 | 1.88 | 1.88 |
| 8 | 7766 | 2021-01-13T00:00:00 | 0.09 | 0.08 | 0.09 | 0.1 | 0.12 | 0.14 | 0.22 | 0.48 | 0.8 | 1.1 | 1.63 | 1.82 | 1.82 |
| 9 | 7767 | 2021-01-14T00:00:00 | 0.09 | 0.09 | 0.09 | 0.09 | 0.1 | 0.16 | 0.23 | 0.49 | 0.82 | 1.15 | 1.69 | 1.88 | 1.88 |
| 10 | 7768 | 2021-01-15T00:00:00 | 0.08 | 0.09 | 0.09 | 0.1 | 0.1 | 0.13 | 0.2 | 0.46 | 0.78 | 1.11 | 1.66 | 1.85 | 1.85 |
| 11 | 7769 | 2021-01-19T00:00:00 | 0.07 | 0.09 | 0.09 | 0.11 | 0.1 | 0.14 | 0.21 | 0.45 | 0.78 | 1.1 | 1.65 | 1.84 | 1.84 |
| 12 | 7770 | 2021-01-20T00:00:00 | 0.08 | 0.08 | 0.08 | 0.1 | 0.1 | 0.13 | 0.19 | 0.45 | 0.78 | 1.1 | 1.65 | 1.84 | 1.84 |
| 13 | 7771 | 2021-01-21T00:00:00 | 0.07 | 0.09 | 0.09 | 0.09 | 0.1 | 0.13 | 0.19 | 0.45 | 0.79 | 1.12 | 1.68 | 1.87 | 1.87 |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| 240 | 7998 | 2021-12-15T00:00:00 | 0.03 | 0.05 | 0.05 | 0.13 | 0.29 | 0.69 | 1.0 | 1.26 | 1.42 | 1.47 | 1.91 | 1.86 | 1.86 |
| 241 | 7999 | 2021-12-16T00:00:00 | 0.04 | 0.06 | 0.05 | 0.13 | 0.26 | 0.64 | 0.92 | 1.19 | 1.36 | 1.44 | 1.91 | 1.87 | 1.87 |
| 242 | 8000 | 2021-12-17T00:00:00 | 0.03 | 0.04 | 0.05 | 0.13 | 0.27 | 0.66 | 0.93 | 1.18 | 1.34 | 1.41 | 1.87 | 1.82 | 1.82 |
| 243 | 8001 | 2021-12-20T00:00:00 | 0.03 | 0.05 | 0.07 | 0.16 | 0.27 | 0.65 | 0.91 | 1.17 | 1.34 | 1.43 | 1.9 | 1.85 | 1.85 |
| 244 | 8002 | 2021-12-21T00:00:00 | 0.03 | 0.04 | 0.07 | 0.16 | 0.29 | 0.7 | 0.96 | 1.24 | 1.4 | 1.48 | 1.92 | 1.89 | 1.89 |
| 245 | 8003 | 2021-12-22T00:00:00 | 0.03 | 0.04 | 0.08 | 0.16 | 0.28 | 0.68 | 0.96 | 1.23 | 1.39 | 1.46 | 1.89 | 1.86 | 1.86 |
| 246 | 8004 | 2021-12-23T00:00:00 | 0.04 | 0.05 | 0.07 | 0.18 | 0.31 | 0.71 | 0.97 | 1.25 | 1.42 | 1.5 | 1.94 | 1.91 | 1.91 |
| 247 | 8005 | 2021-12-27T00:00:00 | 0.04 | 0.05 | 0.06 | 0.21 | 0.33 | 0.76 | 0.98 | 1.26 | 1.41 | 1.48 | 1.92 | 1.88 | 1.88 |
| 248 | 8006 | 2021-12-28T00:00:00 | 0.03 | 0.04 | 0.06 | 0.2 | 0.39 | 0.74 | 0.99 | 1.27 | 1.41 | 1.49 | 1.94 | 1.9 | 1.9 |
| 249 | 8007 | 2021-12-29T00:00:00 | 0.01 | 0.02 | 0.05 | 0.19 | 0.38 | 0.75 | 0.99 | 1.29 | 1.47 | 1.55 | 2.0 | 1.96 | 1.96 |
| 250 | 8008 | 2021-12-30T00:00:00 | 0.06 | 0.06 | 0.05 | 0.19 | 0.38 | 0.73 | 0.98 | 1.27 | 1.44 | 1.52 | 1.97 | 1.93 | 1.93 |
| 251 | 8009 | 2021-12-31T00:00:00 | 0.06 | 0.05 | 0.06 | 0.19 | 0.39 | 0.73 | 0.97 | 1.26 | 1.44 | 1.52 | 1.94 | 1.9 | 1.9 |
Real CMT Yield Curve Rates¶
and the real yield curve rates,
tuy = DailyTreasuryYieldCurveRateData("2019", false)
| Row | Id | dt | BC_1MONTH | BC_2MONTH | BC_3MONTH | BC_6MONTH | BC_1YEAR | BC_2YEAR | BC_3YEAR | BC_5YEAR | BC_7YEAR | BC_10YEAR | BC_20YEAR | BC_30YEAR | BC_30YEARDISPLAY |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Int64 | DateTime | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | |
| 1 | 4004 | 2019-01-02T00:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1.0 | 0.98 | 0.96 | 1.07 | 1.19 | NaN |
| 2 | 4005 | 2019-01-03T00:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.88 | 0.88 | 0.88 | 1.01 | 1.14 | NaN |
| 3 | 4006 | 2019-01-04T00:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.91 | 0.91 | 0.91 | 1.05 | 1.17 | NaN |
| 4 | 4007 | 2019-01-07T00:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.92 | 0.92 | 0.92 | 1.06 | 1.18 | NaN |
| 5 | 4008 | 2019-01-08T00:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.92 | 0.91 | 0.91 | 1.03 | 1.15 | NaN |
| 6 | 4009 | 2019-01-09T00:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.89 | 0.9 | 0.91 | 1.05 | 1.16 | NaN |
| 7 | 4010 | 2019-01-10T00:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.91 | 0.91 | 0.93 | 1.08 | 1.2 | NaN |
| 8 | 4011 | 2019-01-11T00:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.86 | 0.86 | 0.88 | 1.05 | 1.18 | NaN |
| 9 | 4012 | 2019-01-14T00:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.89 | 0.88 | 0.9 | 1.07 | 1.2 | NaN |
| 10 | 4013 | 2019-01-15T00:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.9 | 0.9 | 0.91 | 1.08 | 1.21 | NaN |
| 11 | 4014 | 2019-01-16T00:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.9 | 0.9 | 0.91 | 1.06 | 1.2 | NaN |
| 12 | 4015 | 2019-01-17T00:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.94 | 0.95 | 0.96 | 1.1 | 1.21 | NaN |
| 13 | 4016 | 2019-01-18T00:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.95 | 0.95 | 0.96 | 1.1 | 1.21 | NaN |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| 239 | 4242 | 2019-12-13T00:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.04 | 0.07 | 0.12 | 0.32 | 0.5 | NaN |
| 240 | 4243 | 2019-12-16T00:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.08 | 0.11 | 0.16 | 0.35 | 0.52 | NaN |
| 241 | 4244 | 2019-12-17T00:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.06 | 0.09 | 0.14 | 0.34 | 0.52 | NaN |
| 242 | 4245 | 2019-12-18T00:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.06 | 0.1 | 0.16 | 0.36 | 0.54 | NaN |
| 243 | 4246 | 2019-12-19T00:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.05 | 0.09 | 0.15 | 0.34 | 0.52 | NaN |
| 244 | 4247 | 2019-12-20T00:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.06 | 0.1 | 0.15 | 0.35 | 0.52 | NaN |
| 245 | 4248 | 2019-12-23T00:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.08 | 0.12 | 0.18 | 0.37 | 0.55 | NaN |
| 246 | 4249 | 2019-12-24T00:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.05 | 0.09 | 0.15 | 0.35 | 0.53 | NaN |
| 247 | 4250 | 2019-12-26T00:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.06 | 0.1 | 0.16 | 0.36 | 0.54 | NaN |
| 248 | 4251 | 2019-12-27T00:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.04 | 0.08 | 0.15 | 0.36 | 0.54 | NaN |
| 249 | 4252 | 2019-12-30T00:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.02 | 0.08 | 0.15 | 0.38 | 0.55 | NaN |
| 250 | 4253 | 2019-12-31T00:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.01 | 0.07 | 0.15 | 0.39 | 0.58 | NaN |
I now have the two curve rate datasets to be analyzed in this notebook. I stumbled upon some NaNs; usually when this happens, there is no public data for the specific maturities. I will create a Boolean matrix to check for empty cells.
B = [isnan(tuy[i, j]) for i in 1:size(tuy, 1), j in 3:size(tuy, 2)]
250×13 Matrix{Bool}:
1 1 1 1 1 1 1 0 0 0 0 0 1
1 1 1 1 1 1 1 0 0 0 0 0 1
1 1 1 1 1 1 1 0 0 0 0 0 1
1 1 1 1 1 1 1 0 0 0 0 0 1
1 1 1 1 1 1 1 0 0 0 0 0 1
1 1 1 1 1 1 1 0 0 0 0 0 1
1 1 1 1 1 1 1 0 0 0 0 0 1
1 1 1 1 1 1 1 0 0 0 0 0 1
1 1 1 1 1 1 1 0 0 0 0 0 1
1 1 1 1 1 1 1 0 0 0 0 0 1
1 1 1 1 1 1 1 0 0 0 0 0 1
1 1 1 1 1 1 1 0 0 0 0 0 1
1 1 1 1 1 1 1 0 0 0 0 0 1
⋮ ⋮ ⋮
1 1 1 1 1 1 1 0 0 0 0 0 1
1 1 1 1 1 1 1 0 0 0 0 0 1
1 1 1 1 1 1 1 0 0 0 0 0 1
1 1 1 1 1 1 1 0 0 0 0 0 1
1 1 1 1 1 1 1 0 0 0 0 0 1
1 1 1 1 1 1 1 0 0 0 0 0 1
1 1 1 1 1 1 1 0 0 0 0 0 1
1 1 1 1 1 1 1 0 0 0 0 0 1
1 1 1 1 1 1 1 0 0 0 0 0 1
1 1 1 1 1 1 1 0 0 0 0 0 1
1 1 1 1 1 1 1 0 0 0 0 0 1
1 1 1 1 1 1 1 0 0 0 0 0 1
Counting the columns that are empty across all $250$ trading days in the dataset, using the expression
$$ \sum_{j = 1}^{n} \prod_{i = 1}^{m} I_{ij} $$
where the product acts as a logical AND down each column (it equals $1$ only if every cell in the column is empty), the outer sum counts such columns, $m$ is the number of trading days in the year in question ($\sim 252$ on average), $n$ is the number of columns, and $I_{ij}$ is the empty-cell indicator, piecewise-defined as
$$ I_{ij} = \begin{cases} 0 & \text{if not empty} \\ 1 & \text{if empty} \end{cases} $$
Then, I have
n = size(B)[2]
sum(prod(B[:, j]) for j in 1:n)
8
It seems I have data in columns $10$ to $14$ of the tuy table, because the sum is $8$ (meaning that $8$ of the $13$ checked columns are empty). It can also be visually checked that columns $10$ to $14$ are nonempty. To double-check this conjecture, I find that
sum(prod(B[:, j]) for j in 1:7) + 1
8
which confirms that there is data in columns $10$ to $14$ only (columns $3$ to $9$ contribute the $7$ empty columns, and the manual $+1$ accounts for the last checked column). Great!
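An equivalent, more direct check is possible with Base functions alone; a sketch (column indices refer to B, which is offset by two from tuy):
empty_cols = findall(j -> all(B[:, j]), 1:size(B, 2))   # expect [1, ..., 7, 13]
Now, here's what the data looks like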
hcat(tuy[:, 2], tuy[:, 10:14])
| Row | x1 | BC_5YEAR | BC_7YEAR | BC_10YEAR | BC_20YEAR | BC_30YEAR |
|---|---|---|---|---|---|---|
| DateTime | Float64 | Float64 | Float64 | Float64 | Float64 | |
| 1 | 2019-01-02T00:00:00 | 1.0 | 0.98 | 0.96 | 1.07 | 1.19 |
| 2 | 2019-01-03T00:00:00 | 0.88 | 0.88 | 0.88 | 1.01 | 1.14 |
| 3 | 2019-01-04T00:00:00 | 0.91 | 0.91 | 0.91 | 1.05 | 1.17 |
| 4 | 2019-01-07T00:00:00 | 0.92 | 0.92 | 0.92 | 1.06 | 1.18 |
| 5 | 2019-01-08T00:00:00 | 0.92 | 0.91 | 0.91 | 1.03 | 1.15 |
| 6 | 2019-01-09T00:00:00 | 0.89 | 0.9 | 0.91 | 1.05 | 1.16 |
| 7 | 2019-01-10T00:00:00 | 0.91 | 0.91 | 0.93 | 1.08 | 1.2 |
| 8 | 2019-01-11T00:00:00 | 0.86 | 0.86 | 0.88 | 1.05 | 1.18 |
| 9 | 2019-01-14T00:00:00 | 0.89 | 0.88 | 0.9 | 1.07 | 1.2 |
| 10 | 2019-01-15T00:00:00 | 0.9 | 0.9 | 0.91 | 1.08 | 1.21 |
| 11 | 2019-01-16T00:00:00 | 0.9 | 0.9 | 0.91 | 1.06 | 1.2 |
| 12 | 2019-01-17T00:00:00 | 0.94 | 0.95 | 0.96 | 1.1 | 1.21 |
| 13 | 2019-01-18T00:00:00 | 0.95 | 0.95 | 0.96 | 1.1 | 1.21 |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| 239 | 2019-12-13T00:00:00 | 0.04 | 0.07 | 0.12 | 0.32 | 0.5 |
| 240 | 2019-12-16T00:00:00 | 0.08 | 0.11 | 0.16 | 0.35 | 0.52 |
| 241 | 2019-12-17T00:00:00 | 0.06 | 0.09 | 0.14 | 0.34 | 0.52 |
| 242 | 2019-12-18T00:00:00 | 0.06 | 0.1 | 0.16 | 0.36 | 0.54 |
| 243 | 2019-12-19T00:00:00 | 0.05 | 0.09 | 0.15 | 0.34 | 0.52 |
| 244 | 2019-12-20T00:00:00 | 0.06 | 0.1 | 0.15 | 0.35 | 0.52 |
| 245 | 2019-12-23T00:00:00 | 0.08 | 0.12 | 0.18 | 0.37 | 0.55 |
| 246 | 2019-12-24T00:00:00 | 0.05 | 0.09 | 0.15 | 0.35 | 0.53 |
| 247 | 2019-12-26T00:00:00 | 0.06 | 0.1 | 0.16 | 0.36 | 0.54 |
| 248 | 2019-12-27T00:00:00 | 0.04 | 0.08 | 0.15 | 0.36 | 0.54 |
| 249 | 2019-12-30T00:00:00 | 0.02 | 0.08 | 0.15 | 0.38 | 0.55 |
| 250 | 2019-12-31T00:00:00 | 0.01 | 0.07 | 0.15 | 0.39 | 0.58 |
Note: In this procedure, I am not accounting for the possibility that one of the potentially empty columns contains some data. This simplification removes the burden of searching for a non-empty cell within a column of empty cells. I could have easily checked that an empty column is indeed empty, using the following procedure:
$$ \sum_{j = 3}^{9} \left[ \sum_{i = 1}^{m} I_{ij} = m \right] + \left[ \sum_{i = 1}^{m} I_{in} = m \right] $$
where $[\cdot]$ is the Iverson bracket ($1$ if the condition inside holds, $0$ otherwise), $m$ is the number of trading days (# of rows) and $n$ is the index of the last checked column (# of defined maturities).
T = 250
sum(B[i, j] for i in 1:T, j in 1:7) + sum(B[i, 13] for i in 1:T) == 250 * 8
true
What I am checking here is that there are exactly $2,000$ empty cells in the dataset.
Using this procedure, I know with $100\%$ certainty that columns $3$ to $9$ and column $15$ are empty, because the empty cells sum to $250 \times 8 = 2{,}000$. There is some loss of generality in this procedure, since one level of abstraction is removed (i.e., I must already know which columns ought to be empty).
Saving the Datasets for Use in Other Notebooks¶
It may be worthwhile to save the original datasets for use in other notebooks.
serialize("N_CMT", tux)
serialize("R_CMT", tuy)
Cumulative Data Representation¶
At first, I thought a cumulative sum would not be very useful for analyzing coupon rates, but it is useful for tracking changes in the yield curve rate. For instance, if the slope is steep, the yield curve rate is increasing substantially (higher risk), and if it flattens into a plateau, the yield is increasing slowly (lower investment risk). Below is the cumulative analysis, starting with
function fix_array_expression(arr)
    # Collapse each of the five cumulative series into 10 bins of 25 trading
    # days each, summing within every bin.
    tmp = Array{Any}(undef, (10, 5))
    for C in 1:5
        tmp[:, C] = [sum(arr[C][(25(k - 1) + 1):(25k)]) for k in 1:10]
    end
    return tmp
end
fix_array_expression (generic function with 1 method)
to partition the cumulative yield increases into ticks of $25$ trading days (this will help arrange the cumulative bar plots into $10$ bins of $25$ days each). Then I have that
using IJulia, StatsPlots
mycumsum1 = cumsum(tux[:, C] for C in 10:14) ./ 250;
mycumsum2 = cumsum(tuy[:, C] for C in 10:14) ./ 250;
mycumsum1 = fix_array_expression(mycumsum1)
mycumsum2 = fix_array_expression(mycumsum2)
p = [mycumsum1[:, C] for C in 1:5]; p = cumsum.(p)
q = [mycumsum2[:, C] for C in 1:5]; q = cumsum.(q)
p = [groupedbar([p[R] q[R]], labels = ["nominal" "real"]) for R in 1:5];
ticklabel = string.([I for I in 25:25:250])
gr()
plot!(p[1], p[2], p[3], p[4], bar_position = :dodge,
title = "Nominal / Real Cumsum",
layout = (2, 2), legend = :topleft, size = (900, 500), xticks = (1:10, ticklabel))
plot!(xlab = "days", ylab = "normalized cumsum")
ticklabel = string.([I for I in 25:25:250])
gr()
plot(p[5], bar_position = :dodge,
title = "Nominal / Real Cumsum",
layout = (1, 1), legend = :topleft, size = (900, 500), xticks = (1:10, ticklabel))
plot!(xlab = "days", ylab = "normalized cumsum")
where the series in blue are nominal rates and the series in red are real rates.
Plotting Nominal CMT Yield Curve Rates for Up to One (1) Year and then $Y \in [5,\ 7,\ 10,\ 20,\ 30]$ Years¶
mmx = range(25, 250, step = 25)
25:25:250
ticklabel = string.(tux[:, 2])
lb = ["1mo", "2mo", "3mo", "6mo", "1yr"]
p = [plot(tux[:, C],
label = lb[C-2], layout = (1, 1), legend = :bottomleft, size = (900, 500), xrotation = 0) for C in 3:7];
plot(p[1], p[2], p[3], p[4], layout = (2, 2))
plot(p[5])
lb = ["5yr", "7yr", "10yr", "20yr", "30yr"]
p = [plot(tux[:, C],
label = lb[C-9],
layout = (1, 1), legend = :bottomleft, size = (900, 500)) for C in 10:14];
plot(p[1], p[2], p[3], p[4], layout = (2, 2))
plot(p[5])
There seems to be some autocorrelation in the data, with wide variance in interest rates across all series, except that the 6-month bill and 1-year rates show an exponential-looking rise starting after the middle of fiscal year $2021$, with the six-month turning up closer to the end of the year (in financial mathematics, this would correspond to a change of measure on the probability space $(\Omega, \mathcal{F}, \mathcal{P})$).
Plotting Real Interest Rates¶
lb = ["1mo", "2mo", "3mo", "6mo", "1yr"]
p = [plot(tuy[:, C],
label = lb[C-9],
layout = (1, 1), legend = :bottomleft, size = (900, 500)) for C in 10:14]
plot(p[1], p[2], p[3], p[4], layout = (2, 2))
plot(p[5])
The raw (non-transformed) data exhibit similar trends for all maturities, with the yield decreasing towards the end of the year.
Plotting Nominal $-$ Real Yield Curve Rates for 5, 7, 10, 20, and 30 Year CMTs to Determine Inflation Rates¶
mtux = Matrix(tux)
mtuy = Matrix(tuy)
N = min(size(mtux)[1], size(mtuy)[1])
mtnf = mtux[1:N, 10:14] .- mtuy[1:N, 10:14]
lb = ["5yr", "7yr", "10yr", "20yr", "30yr"]
p = [plot(mtnf[1:N, C],
label = lb[C],
layout = (1, 1), legend = :bottomleft, size = (900, 500)) for C in 1:5]
plot(p[1], p[2], p[3], p[4], layout = (2, 2))
plot(p[5])
Saving tnf for Use in Other Notebooks¶
using CSV, DataFrames
CSV.write("tnf.csv", (x5 = mtnf[:, 1], x7 = mtnf[:, 2], x10 = mtnf[:, 3], x20 = mtnf[:, 4], x30 = mtnf[:, 5]));
Inflation Rates for 2021¶
Reading CSV with inflation rates for US Treasury yield curve rates in $2021$,
tnf = CSV.read("tnf.csv", DataFrame)
| Row | x5 | x7 | x10 | x20 | x30 |
|---|---|---|---|---|---|
| Float64 | Float64 | Float64 | Float64 | Float64 | |
| 1 | -0.64 | -0.34 | -0.03 | 0.39 | 0.47 |
| 2 | -0.5 | -0.22 | 0.08 | 0.48 | 0.56 |
| 3 | -0.48 | -0.17 | 0.13 | 0.55 | 0.64 |
| 4 | -0.46 | -0.14 | 0.16 | 0.58 | 0.67 |
| 5 | -0.43 | -0.1 | 0.22 | 0.64 | 0.72 |
| 6 | -0.39 | -0.06 | 0.24 | 0.63 | 0.72 |
| 7 | -0.41 | -0.08 | 0.22 | 0.6 | 0.68 |
| 8 | -0.38 | -0.06 | 0.22 | 0.58 | 0.64 |
| 9 | -0.4 | -0.06 | 0.25 | 0.62 | 0.68 |
| 10 | -0.44 | -0.12 | 0.2 | 0.58 | 0.64 |
| 11 | -0.45 | -0.12 | 0.19 | 0.59 | 0.64 |
| 12 | -0.49 | -0.17 | 0.14 | 0.55 | 0.63 |
| 13 | -0.5 | -0.16 | 0.16 | 0.58 | 0.66 |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| 239 | 1.19 | 1.32 | 1.32 | 1.55 | 1.32 |
| 240 | 1.18 | 1.31 | 1.31 | 1.56 | 1.34 |
| 241 | 1.13 | 1.27 | 1.3 | 1.57 | 1.35 |
| 242 | 1.12 | 1.24 | 1.25 | 1.51 | 1.28 |
| 243 | 1.12 | 1.25 | 1.28 | 1.56 | 1.33 |
| 244 | 1.18 | 1.3 | 1.33 | 1.57 | 1.37 |
| 245 | 1.15 | 1.27 | 1.28 | 1.52 | 1.31 |
| 246 | 1.2 | 1.33 | 1.35 | 1.59 | 1.38 |
| 247 | 1.2 | 1.31 | 1.32 | 1.56 | 1.34 |
| 248 | 1.23 | 1.33 | 1.34 | 1.58 | 1.36 |
| 249 | 1.27 | 1.39 | 1.4 | 1.62 | 1.41 |
| 250 | 1.26 | 1.37 | 1.37 | 1.58 | 1.35 |
Plotting the inflation rate for the $5$yr and $7$yr notes, as well as the $10$yr, $20$yr and $30$yr bonds, distinguishing between training and test datasets (currently the s_test dataset is empty, although I will try some cross validation, sampling across thirty years of historical yield curve rate data, in Treas TS Analysis II).
include("helpers.jl"); using Plots
s = tnf
lb = ["5Y", "7Y", "10Y", "20Y", "30Y"]; s_train, s_test, p = h1(s, lb, 1); p
There seems to be some correlation across maturities. I now draw QQ-Norm plots to visually understand the distribution of inflation rates (even though I know it is wrong to assume a normal distribution). Correlation across maturities will be analyzed more in depth in Treas TS Analysis II.
using Distributions
μ = [mean(tnf[:, i]) for i in 1:size(tnf)[2]]
σ = [std(tnf[:, i]) for i in 1:size(tnf)[2]]
t = ["QQ-Norm for 5Y CMT", "QQ-Norm for 7Y CMT", "QQ-Norm for 10Y CMT", "QQ-Norm for 20Y CMT", "QQ-Norm for 30Y CMT"]
p = [qqplot(Normal(μ[C], σ[C]), tnf[:, C], title = t[C], ylabel = "inflation") for C in 1:5]
gr(); plot(p[1], p[2], p[3], p[4], layout = (2, 2))
gr(); plot(p[5])
These are all of them taken together.
lb = ["5Y", "7Y", "10Y", "20Y", "30Y"]
h2(s_train, 1, 5, lb)
The distribution of the inflation rates is not normal for any maturity; more information is needed before drawing conclusions, especially since feeding the sample mean and std into a Normal density amounts to a parametric normality assumption. In my naivete, I thought this approach would be right, but I discovered later that it is not, although QQ-Norm does allow visualization of "fat tails."
What I could have done instead was compare the empirical distribution above to a Normal distribution, rather than fitting the data to a normal density. In fact, I could have inspected the QQ-Norm plot closely, fitted a regression line, compared to $\mathcal{N}(\mu, \sigma^2)$, and finally drawn the line passing through the first and third quartiles of the distribution. The QQ-plot functions would be as follows, taking each maturity one at a time.
using Distributions
x = [i for i in 1:250]
N = [Normal(μ[C], σ[C]) for C in 1:5]
gr()
t = ["QQ-Plots for 5Y CMT", "QQ-Plots for 7Y CMT", "QQ-Plots for 10Y CMT", "QQ-Plots for 20Y CMT", "QQ-Plots for 30Y CMT"]
p = [plot(
    qqplot(x, tnf[:, C], qqline = :fit), # fitted regression line: chronological index along the abscissa,
                                         # inflation data points along the ordinate
    qqplot(N[C], tnf[:, C]),             # comparison to N(mu, sigma^2) fitted from the sample moments,
                                         # to view the quantile-to-quantile offset from a normal distribution
    qqnorm(tnf[:, C], qqline = :R))      # R-style default line through the first and third quartiles
    for C in 1:5]
gr(); plot(p[1], p[2], p[3], p[4], layout = (2, 2))
plot(p[5])
Comparing the plots, I can easily spot that the $10$, $20$ and $30$ year bonds are not normal at all! It makes sense to let qqnorm fit the reference line from the data rather than assume that a time series follows a particular model.
Annual Inflation Distribution (PDF and CDF) for $2021$¶
With the data I have for $2021$, I can obtain something close to a probability density function ($\text{PDF}$) to learn more about the data dispersion. I say "something close" to a $\text{PDF}$ because sufficient observations are needed for a proper distribution. How much is sufficient? The central limit theorem ($\text{CLT}$) says that when random variables are independent and adequately normalized, the distribution of their sample mean tends toward a normal distribution, even if the original variables are not normally distributed; this is the same as saying that the sampling distribution of the mean is approximately normal, with standard deviation equal to the standard error of the estimator.
In statistics, a $Z$-score is often taken, where
$$ Z = \frac{x - \bar{X}}{s_x} $$
Here, $s_x$ is the standard deviation of the sample (not the population). For a bound $z \geq 0$, $|Z| \leq z$ is then equivalent to
$$ \bar{X} - s_x z \leq x \leq \bar{X} + s_x z $$
Econometricians like to talk about probabilities. How about the probability that $|Z| \leq z$, that is, $P(|Z| \leq z) = P(-z \leq Z \leq z)$? After normalizing data points into standard $Z$-scores, these probabilities help an econometrician determine the value-at-risk (VaR), but this only works when the distribution follows a standard $Z$ variable or a $\text{PDF}$ kernel distributed like one.
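A minimal sketch of that idea, assuming (contrary to what the QQ plots above suggest) that the 5Y series were normal; the $95\%$ level is an arbitrary choice:
z = (tnf[:, 1] .- μ[1]) ./ σ[1]        # standardized Z-scores of the 5Y inflation series
zq = quantile(Normal(), 0.05)          # ≈ -1.645, the 5% left-tail cutoff of a standard normal
VaR = μ[1] + σ[1] * zq                 # inflation level breached only 5% of the time under normality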
Law of Large Numbers
The law of large numbers (LLN) is a result from mathematics about performing the same experiment repeatedly. It states that the average of the results tends towards the expected value $E[X]$ as the number of trials tends to infinity (i.e., when the experiment is repeated a sufficiently large number of times).
In order to replicate results for the "same" experiment, a "statistical datalogist," or data scientist, would need to take into account the reproducibility of the experiment conducted through independent trials, regardless of whether these trials are run simultaneously or not.
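A tiny illustration of the LLN with a synthetic fair coin (the seed and sample sizes are arbitrary):
using Random
Random.seed!(1)                                      # arbitrary seed, for reproducibility
flips = rand(0:1, 100_000)                           # independent fair-coin trials, E[X] = 0.5
[mean(flips[1:n]) for n in (10, 1_000, 100_000)]     # running means settle near 0.5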
Probability Distribution Function
Time series differ from traditional distributions because they are different data types: distribution data is unordered, while time series data is ordered (equally spaced chronological dates run along the abscissa and the corresponding rates along the ordinate). However, it is still useful to have an annual distribution of the data.
Algorithm (Pseudo Code) for Annual Distribution
- sort X
- find range of X
- decide on the number of bins (K)
- bin_width = $\lceil \text{range} / K \rceil$
- $[S_0, S_1), [S_1, S_2), \cdots, [S_{K-1}, S_K]$
- count observations falling in each bin
- construct table $[\text{smallest}, \cdots, \text{largest}]$
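The steps above, condensed into one self-contained function (a sketch; obs is any numeric vector and K the bin count — bin_counts is a hypothetical helper, not part of Treas.jl):
function bin_counts(obs, K)
    X = sort(obs)                                   # 1. sort
    lo, hi = X[1], X[end]                           # 2. range of X
    w = (hi - lo) / K                               # 3.-4. bin width from K bins
    counts = zeros(Int, K)
    for x in X                                      # 5.-6. count observations per bin
        k = min(Int(floor((x - lo) / w)) + 1, K)    # clamp so the maximum lands in bin K
        counts[k] += 1
    end
    return counts                                   # 7. the table, smallest to largest
end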
With the pseudo-code above, I can now bin the observations for each maturity one step at a time, first sorting the observations.
X = sort(tuy[:, 10]);
Then, I approximate the range of the underlying continuous random variable by the sample range (the largest observation minus the smallest).
rng = X[end] - X[1]
0.99
Earlier I had chosen $25$ bins. Therefore,
K = 25
25
Then, putting it all together, to count the total frequency for each bin and return a custom-made "histogram" for, say, the $5\text{Y}$ T-note, I obtain
K = 25
X = sort(tuy[:, 10])
rng = X[end] - X[1]
bin_width = rng / K
T = X[1]; freqs = zeros(K)      # bins start at the sample minimum, not at zero
k = 1
while k <= K
    for x in X
        if x >= T && (x < T + bin_width || k == K)   # the last bin is closed on the right
            freqs[k] = freqs[k] + 1
        end
    end
    T = T + bin_width
    k = k + 1
end
bar(freqs)
Note I could have done the same using the histogram function in Julia. However, the histogram function may not know how to bin the data, unless you tell it how to do so.
histogram(X, bins = 25)
There seems to be some positive skewness in the five-year observations. Here are all five maturities taken together.
K = 25
freq = zeros(K, 5)
for I in 1:5
    X = sort(tuy[:, I+9])
    rng = X[end] - X[1]
    bin_width = rng / K
    T = X[1]                    # again, start the bins at the sample minimum
    k = 1
    while k <= K
        for x in X
            if x >= T && (x < T + bin_width || k == K)
                freq[k, I] = freq[k, I] + 1
            end
        end
        T = T + bin_width
        k = k + 1
    end
end
groupedbar(freq, bar_position = :dodge, orientation = :vertical, legend = :topleft)
groupedbar(freq, bar_position = :dodge, orientation = :vertical, legend = :topleft)
The law of large numbers says that, no matter what the distribution, the sample mean converges to the true expected value as trials are repeated; the central limit theorem says that the sampling distribution of the (normalized) mean approaches a normal distribution. I often get confused because both involve many trials, but it is easier to remember by considering what is converging: the average of repeated trials of the same experiment converging to the expected value (LLN), or the shape of the sampling distribution converging to a normal curve (CLT).
Cumulative Distribution Function
Similarly, a cumulative-style view of inflation can be built by summing the sorted observations within consecutive bins (something like this was tried earlier for the nominal and real $\text{CMT}$ rates).
K = 25
freq = zeros(K, size(tnf)[2])
for I in 1:size(freq)[2]
X = sort(tnf[:, I])
D = 1
rng = size(X)[1]
bin_width = Int(ceil(rng / K))
for k in 1:K
Y = X[D:D+bin_width-1]
for y in Y
freq[k, I] = freq[k, I] + y
end
D = D + bin_width
end
end
groupedbar(freq, bar_position = :dodge, orientation = :vertical, legend = :topleft)
See the deflation, particularly for the $5Y$ and $7Y$ notes, at the beginning of the business cycle for fiscal year $2021$?
Boxplot for Inflation Rates¶
A boxplot is useful for finding outliers in the average yield curve rates, and also captures the median and basic measures of dispersion such as the $\text{IQR}$ and the range.
lb = ["5y", "7y", "10y", "20y", "30y"]
function h3(M, A, B)
b = boxplot(M[:, A], label = lb[1], layout = (1, 1), legend = :bottomright, size = (900, 500))
for C in A+1:B
boxplot!(M[:, C], label = lb[C])
end
return b
end
h3(tnf, 1, 5)
Having found outliers beneath the lower limit of all CMTs, it would be unwise to proceed with taking the mean, median, standard deviation ($\sqrt{Var(X)}$), skewness and excess kurtosis period by period without properly dealing with the outliers first.
Winsorized Mean¶
Typically, one would take a winsorized mean with $y = 5\%$ in total ($2.5\%$ in each tail),
where the interpolated rank of each cutoff is
$L_y = (n + 1) \times (y\ /\ 100)$
and the cutoff value interpolates between the neighboring order statistics, using the interpolation formula:
$P_y = X_{(\lfloor L_y \rfloor)} + (L_y - \lfloor L_y \rfloor) \times (X_{(\lceil L_y \rceil)} - X_{(\lfloor L_y \rfloor)})$
which produces
a = 0.025                          # trim 2.5% in each tail (5% total)
l = (size(tnf)[1] + 1)a            # lower interpolated rank, L_y
u = (size(tnf)[1] + 1) * (1 - a)   # upper interpolated rank
(l, u)
(6.275, 244.725)
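Standard percentile interpolation operates on the sorted sample (its order statistics); a sketch for the lower cutoff of the first column (note that the computation that follows indexes the rows in chronological order instead):
xs = sort(tnf[:, 1])                   # order statistics of the 5Y column
L = (length(xs) + 1) * 0.025           # interpolated rank, 6.275
Pl_sorted = xs[Int(floor(L))] + (L - floor(L)) * (xs[Int(ceil(L))] - xs[Int(floor(L))])
The notebook's computation is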
Pl = [tnf[Int(floor(l)), c] + (l - Int(floor(l))) * (tnf[Int(ceil(l)), c] - tnf[Int(floor(l)), c]) for c in 1:5]
Pu = [tnf[Int(floor(u)), c] + (u - Int(floor(u))) * (tnf[Int(ceil(u)), c] - tnf[Int(floor(u)), c]) for c in 1:5]
b = DataFrame(x05 = Pu[1] .> tnf[:, 1] .> Pl[1], x07 = Pu[2] .> tnf[:, 2] .> Pl[2], x10 = Pu[3] .> tnf[:, 3] .> Pl[3], x20 = Pu[4] .> tnf[:, 4] .> Pl[4], x30 = Pu[5] .> tnf[:, 5] .> Pl[5])
| Row | x05 | x07 | x10 | x20 | x30 |
|---|---|---|---|---|---|
| Bool | Bool | Bool | Bool | Bool | |
| 1 | false | false | false | false | false |
| 2 | false | false | false | false | false |
| 3 | false | false | false | false | false |
| 4 | false | false | false | false | false |
| 5 | false | false | false | true | true |
| 6 | true | true | true | true | true |
| 7 | false | false | false | false | false |
| 8 | true | true | false | false | false |
| 9 | false | true | true | false | false |
| 10 | false | false | false | false | false |
| 11 | false | false | false | false | false |
| 12 | false | false | false | false | false |
| 13 | false | false | false | false | false |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| 239 | false | false | false | false | true |
| 240 | false | false | false | false | false |
| 241 | true | true | false | false | false |
| 242 | true | true | true | true | true |
| 243 | true | true | true | false | false |
| 244 | false | false | false | false | false |
| 245 | true | true | true | true | true |
| 246 | false | false | false | false | false |
| 247 | false | false | false | false | false |
| 248 | false | false | false | false | false |
| 249 | false | false | false | false | false |
| 250 | false | false | false | false | false |
function win(tf; M = 250, N = 5)
    tf = tnf   # rebinds to the global tnf; DataFrames are passed by reference, so the clamping below mutates tnf itself
    for I in 1:M
        for J in 1:N
            if b[I, J] == 0             # observation falls outside the (Pl, Pu) band
                if tf[I, J] >= Pu[J]
                    tf[I, J] = Pu[J]    # clamp to the upper cutoff
                else
                    tf[I, J] = Pl[J]    # clamp to the lower cutoff
                end
            end
        end
    end
end; tf = tnf; win(tf)
tf
| Row | x5 | x7 | x10 | x20 | x30 |
|---|---|---|---|---|---|
| Float64 | Float64 | Float64 | Float64 | Float64 | |
| 1 | -0.3955 | -0.0655 | 0.2345 | 0.62175 | 0.709 |
| 2 | -0.3955 | -0.0655 | 0.2345 | 0.62175 | 0.709 |
| 3 | -0.3955 | -0.0655 | 0.2345 | 0.62175 | 0.709 |
| 4 | -0.3955 | -0.0655 | 0.2345 | 0.62175 | 0.709 |
| 5 | -0.3955 | -0.0655 | 0.2345 | 0.64 | 0.72 |
| 6 | -0.39 | -0.06 | 0.24 | 0.63 | 0.72 |
| 7 | -0.3955 | -0.0655 | 0.2345 | 0.62175 | 0.709 |
| 8 | -0.38 | -0.06 | 0.2345 | 0.62175 | 0.709 |
| 9 | -0.3955 | -0.06 | 0.25 | 0.62175 | 0.709 |
| 10 | -0.3955 | -0.0655 | 0.2345 | 0.62175 | 0.709 |
| 11 | -0.3955 | -0.0655 | 0.2345 | 0.62175 | 0.709 |
| 12 | -0.3955 | -0.0655 | 0.2345 | 0.62175 | 0.709 |
| 13 | -0.3955 | -0.0655 | 0.2345 | 0.62175 | 0.709 |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| 239 | 1.15825 | 1.27825 | 1.29375 | 1.53375 | 1.32 |
| 240 | 1.15825 | 1.27825 | 1.29375 | 1.53375 | 1.3265 |
| 241 | 1.13 | 1.27 | 1.29375 | 1.53375 | 1.3265 |
| 242 | 1.12 | 1.24 | 1.25 | 1.51 | 1.28 |
| 243 | 1.12 | 1.25 | 1.28 | 1.53375 | 1.3265 |
| 244 | 1.15825 | 1.27825 | 1.29375 | 1.53375 | 1.3265 |
| 245 | 1.15 | 1.27 | 1.28 | 1.52 | 1.31 |
| 246 | 1.15825 | 1.27825 | 1.29375 | 1.53375 | 1.3265 |
| 247 | 1.15825 | 1.27825 | 1.29375 | 1.53375 | 1.3265 |
| 248 | 1.15825 | 1.27825 | 1.29375 | 1.53375 | 1.3265 |
| 249 | 1.15825 | 1.27825 | 1.29375 | 1.53375 | 1.3265 |
| 250 | 1.15825 | 1.27825 | 1.29375 | 1.53375 | 1.3265 |
Lowest Positive Value?¶
However, given that some of these values are negative (indicating deflation), I could try something else. Maybe it is better to reassign the negative values to the lowest positive value found in each series?
tf = sort(Matrix(tnf), dims = 1)   # sort each column so the scan starts at the minimum
function pos(x)
    # return the first non-negative value in sorted column x
    # (with this data, the lowest positive value)
    n = -1; i = 0;
    while n < 0
        i = i + 1
        n = tf[i, x]
    end
    return n
end
M = 250; N = 5
Q = [pos(1), pos(2), pos(3), pos(4), pos(5)]
for I in 1:M
for J in 1:N
if Q[J] > tnf[I, J]
tnf[I, J] = Q[J]
end
end
end
tnf
| Row | x5 | x7 | x10 | x20 | x30 |
|---|---|---|---|---|---|
| Float64 | Float64 | Float64 | Float64 | Float64 | |
| 1 | 0.02 | 0.01 | 0.2345 | 0.62175 | 0.709 |
| 2 | 0.02 | 0.01 | 0.2345 | 0.62175 | 0.709 |
| 3 | 0.02 | 0.01 | 0.2345 | 0.62175 | 0.709 |
| 4 | 0.02 | 0.01 | 0.2345 | 0.62175 | 0.709 |
| 5 | 0.02 | 0.01 | 0.2345 | 0.64 | 0.72 |
| 6 | 0.02 | 0.01 | 0.24 | 0.63 | 0.72 |
| 7 | 0.02 | 0.01 | 0.2345 | 0.62175 | 0.709 |
| 8 | 0.02 | 0.01 | 0.2345 | 0.62175 | 0.709 |
| 9 | 0.02 | 0.01 | 0.25 | 0.62175 | 0.709 |
| 10 | 0.02 | 0.01 | 0.2345 | 0.62175 | 0.709 |
| 11 | 0.02 | 0.01 | 0.2345 | 0.62175 | 0.709 |
| 12 | 0.02 | 0.01 | 0.2345 | 0.62175 | 0.709 |
| 13 | 0.02 | 0.01 | 0.2345 | 0.62175 | 0.709 |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| 239 | 1.15825 | 1.27825 | 1.29375 | 1.53375 | 1.32 |
| 240 | 1.15825 | 1.27825 | 1.29375 | 1.53375 | 1.3265 |
| 241 | 1.13 | 1.27 | 1.29375 | 1.53375 | 1.3265 |
| 242 | 1.12 | 1.24 | 1.25 | 1.51 | 1.28 |
| 243 | 1.12 | 1.25 | 1.28 | 1.53375 | 1.3265 |
| 244 | 1.15825 | 1.27825 | 1.29375 | 1.53375 | 1.3265 |
| 245 | 1.15 | 1.27 | 1.28 | 1.52 | 1.31 |
| 246 | 1.15825 | 1.27825 | 1.29375 | 1.53375 | 1.3265 |
| 247 | 1.15825 | 1.27825 | 1.29375 | 1.53375 | 1.3265 |
| 248 | 1.15825 | 1.27825 | 1.29375 | 1.53375 | 1.3265 |
| 249 | 1.15825 | 1.27825 | 1.29375 | 1.53375 | 1.3265 |
| 250 | 1.15825 | 1.27825 | 1.29375 | 1.53375 | 1.3265 |
Mean ($\mu$) of Inflation Rates¶
I will summarize the inflation rates in two ways: first, the mean inflation rate over successive windows within the year; second, a harmonic mean along each bond maturity, used further below.
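For reference, the harmonic mean of a positive sample is the count divided by the sum of reciprocals; a tiny check with arbitrary numbers (hmean is a hypothetical helper, not part of Treas.jl):
hmean(x) = length(x) / sum(1 ./ x)     # harmonic mean of a positive sample
hmean([1.0, 2.0, 4.0])                 # ≈ 1.714, versus an arithmetic mean of ≈ 2.333
Taking ten intervals of $25$ days each, I find that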
using Statistics
lb = ["5Y", "7Y", "10Y", "20Y", "30Y"]
tf = []
for J in 1:5
m = 1
for I in 25:25:250
append!(tf, mean(tnf[m:I, J]))
m = I + 1
end
end
tf = reshape(tf, 10, 5)
tf = DataFrame(x05 = tf[:, 1], x07 = tf[:, 2], x10 = tf[:, 3], x20 = tf[:, 4], x30 = tf[:, 5])
p = [plot(tf[:, I]) for I in 1:5];
gr();
plot(p[1], p[2], p[3], p[4], label = ["5Y" "7Y" "10Y" "20Y"], layout = (2, 2), legend = :bottomright)
plot(p[5], label = "30Y", legend = :bottomright)
and
t = mean(eachcol(tf))
t = hcat([i for i in 1:10], t)
t = DataFrame(month = t[:, 1] * (12 / 10), rate = t[:, 2])
| Row | month | rate |
|---|---|---|
| Float64 | Float64 | |
| 1 | 1.2 | 0.333942 |
| 2 | 2.4 | 0.664452 |
| 3 | 3.6 | 1.00142 |
| 4 | 4.8 | 0.975102 |
| 5 | 6.0 | 1.07467 |
| 6 | 7.2 | 0.97032 |
| 7 | 8.4 | 1.17034 |
| 8 | 9.6 | 1.19218 |
| 9 | 10.8 | 1.29997 |
| 10 | 12.0 | 1.30228 |
using RCall
@rput(t)
R"""
library(tidyverse)
ggplot(data = t, mapping = aes(x = month, y = rate)) + geom_smooth()
"""
-- Attaching packages --------------------------------------- tidyverse 1.3.2 -- v ggplot2 3.4.0 v purrr 0.3.5 v tibble 3.1.8 v dplyr 1.0.10 v tidyr 1.2.1 v stringr 1.5.0 v readr 2.1.3 v forcats 0.5.2 -- Conflicts ------------------------------------------ tidyverse_conflicts() -- x dplyr::filter() masks stats::filter() x dplyr::lag() masks stats::lag()
RObject{VecSxp}
`geom_smooth()` using method = 'loess' and formula = 'y ~ x'
The last plot is the S-curve often found in annual business cycles (evidently, there was more inflation towards the end of the year in the case of 2021). However, when analysts talk about "yield curves," they usually mean the following pictorial representation, where the (harmonic mean) rate is plotted against maturity and the annual yield curve looks like
sz = size(tnf)[2]
t = [sz / sum(1 ./ tnf[:, I]) for I in 1:sz]   # harmonic-mean summary per maturity (reciprocals broadcast with ./)
t = hcat([i for i in [5, 7, 10, 20, 30]], t)
t = DataFrame(maturity = t[:, 1], rate = t[:, 2])
| Row | maturity | rate |
|---|---|---|
| Float64 | Float64 | |
| 1 | 5.0 | 3.99505 |
| 2 | 7.0 | 5.04834 |
| 3 | 10.0 | 5.64002 |
| 4 | 20.0 | 7.05171 |
| 5 | 30.0 | 6.25709 |
using RCall
@rput(t)
R"""
library(tidyverse)
ggplot(data = t, mapping = aes(x = maturity, y = rate)) + geom_smooth()
"""
┌ Warning: RCall.jl: Warning in simpleLoess(y, x, w, span, degree = degree, parametric = parametric, : │ span too small. fewer data values than degrees of freedom. │ Warning in simpleLoess(y, x, w, span, degree = degree, parametric = parametric, : │ pseudoinverse used at 4.875 │ Warning in simpleLoess(y, x, w, span, degree = degree, parametric = parametric, : │ neighborhood radius 5.125 │ Warning in simpleLoess(y, x, w, span, degree = degree, parametric = parametric, : │ reciprocal condition number 0 │ Warning in simpleLoess(y, x, w, span, degree = degree, parametric = parametric, : │ There are other near singularities as well. 405.02 │ Warning in predLoess(object$y, object$x, newx = if (is.null(newdata)) object$x else if (is.data.frame(newdata)) as.matrix(model.frame(delete.response(terms(object)), : │ span too small. fewer data values than degrees of freedom. │ Warning in predLoess(object$y, object$x, newx = if (is.null(newdata)) object$x else if (is.data.frame(newdata)) as.matrix(model.frame(delete.response(terms(object)), : │ pseudoinverse used at 4.875 │ Warning in predLoess(object$y, object$x, newx = if (is.null(newdata)) object$x else if (is.data.frame(newdata)) as.matrix(model.frame(delete.response(terms(object)), : │ neighborhood radius 5.125 │ Warning in predLoess(object$y, object$x, newx = if (is.null(newdata)) object$x else if (is.data.frame(newdata)) as.matrix(model.frame(delete.response(terms(object)), : │ reciprocal condition number 0 │ Warning in predLoess(object$y, object$x, newx = if (is.null(newdata)) object$x else if (is.data.frame(newdata)) as.matrix(model.frame(delete.response(terms(object)), : │ There are other near singularities as well. 405.02 │ Warning in max(ids, na.rm = TRUE) : │ no non-missing arguments to max; returning -Inf └ @ RCall C:\Users\salma\.julia\packages\RCall\Wyd74\src\io.jl:172
RObject{VecSxp}
`geom_smooth()` using method = 'loess' and formula = 'y ~ x'
See the yield curve inversion at the 30 year bond?
Median of Inflation Rates¶
$$ \operatorname{median}(x) = \begin{cases} x_{(k+1)} & n = 2k + 1 \\ \tfrac{1}{2}\left(x_{(k)} + x_{(k+1)}\right) & n = 2k \end{cases} $$
where $x_{(i)}$ denotes the $i$-th order statistic. Taking $13$ overlapping windows of roughly $19$ days each, I find that
tf = []
for J in 1:5
m = 1
for I in 20:19:250
append!(tf, median(tnf[m:I, J]))
m = I
end
end
tf = reshape(tf, 13, 5)
tf = DataFrame(x05 = tf[:, 1], x07 = tf[:, 2], x10 = tf[:, 3], x20 = tf[:, 4], x30 = tf[:, 5])
| Row | x05 | x07 | x10 | x20 | x30 |
|---|---|---|---|---|---|
| Any | Any | Any | Any | Any | |
| 1 | -0.48 | -0.15 | 0.16 | 0.58 | 0.64 |
| 2 | -0.24 | 0.09 | 0.41 | 0.875 | 0.905 |
| 3 | 0.25 | 0.65 | 0.93 | 1.395 | 1.32 |
| 4 | 0.36 | 0.79 | 1.07 | 1.44 | 1.35 |
| 5 | 0.375 | 0.77 | 1.05 | 1.405 | 1.355 |
| 6 | 0.415 | 0.805 | 1.085 | 1.485 | 1.4 |
| 7 | 0.62 | 0.925 | 1.105 | 1.43 | 1.285 |
| 8 | 0.5 | 0.785 | 1.0 | 1.335 | 1.17 |
| 9 | 0.705 | 1.045 | 1.31 | 1.645 | 1.51 |
| 10 | 0.695 | 1.015 | 1.25 | 1.55 | 1.425 |
| 11 | 1.03 | 1.285 | 1.45 | 1.675 | 1.515 |
| 12 | 1.125 | 1.345 | 1.415 | 1.62 | 1.42 |
| 13 | 1.175 | 1.305 | 1.315 | 1.56 | 1.325 |
p = [plot(tf[:, I]) for I in 1:5];
gr();
plot(p[1], p[2], p[3], p[4], label = ["5Y" "7Y" "10Y" "20Y"], layout = (2, 2), legend = :bottomright)
plot(p[5], label = "30Y", legend = :bottomright)
Standard Deviation of Inflation Rates¶
Let
$$ s = \sqrt{\frac{1}{n - 1} \sum_{i = 1}^n (x_i - \bar{x})^2} $$
Taking $13$ overlapping windows of roughly $19$ days each, I find that
tf = []
for J in 1:5
m = 1
for I in 20:19:250
append!(tf, std(tnf[m:I, J]))
m = I
end
end
tf = reshape(tf, 13, 5)
tf = DataFrame(Y05 = tf[:, 1], Y07 = tf[:, 2], Y10 = tf[:, 3], Y20 = tf[:, 4], Y30 = tf[:, 5])
| Row | Y05 | Y07 | Y10 | Y20 | Y30 |
|---|---|---|---|---|---|
| Any | Any | Any | Any | Any | |
| 1 | 0.0646529 | 0.072446 | 0.071191 | 0.0615138 | 0.0591519 |
| 2 | 0.161226 | 0.178723 | 0.162367 | 0.194736 | 0.163967 |
| 3 | 0.132953 | 0.155241 | 0.159416 | 0.140135 | 0.120193 |
| 4 | 0.0670644 | 0.0842599 | 0.0882147 | 0.0860477 | 0.0816588 |
| 5 | 0.0597715 | 0.0559582 | 0.0517992 | 0.0710726 | 0.0665918 |
| 6 | 0.0820318 | 0.0791983 | 0.0793659 | 0.0732336 | 0.074302 |
| 7 | 0.0849938 | 0.0855493 | 0.0766812 | 0.072204 | 0.0654941 |
| 8 | 0.0791717 | 0.0875094 | 0.0995715 | 0.0965878 | 0.101736 |
| 9 | 0.0461662 | 0.0564195 | 0.0629453 | 0.0722914 | 0.0711022 |
| 10 | 0.10425 | 0.118273 | 0.113252 | 0.119521 | 0.115936 |
| 11 | 0.124159 | 0.0800838 | 0.0694546 | 0.0894707 | 0.103308 |
| 12 | 0.0823903 | 0.0867103 | 0.0791999 | 0.0802693 | 0.0798601 |
| 13 | 0.057308 | 0.0435618 | 0.0446743 | 0.0450146 | 0.0508869 |
p = [plot(tf[:, I]) for I in 1:5];
gr();
plot(p[1], p[2], p[3], p[4], label = ["5Y" "7Y" "10Y" "20Y"], layout = (2, 2), legend = :bottomright)
plot(p[5], label = "30Y", legend = :bottomright)
revealing a varying standard deviation of inflation rates.
Skewness and Excess Kurtosis $(K_E)$ of Inflation Rates¶
$$ \text{skewness} = \frac{1}{n}\sum_{i = 1}^n \frac{(x_i - \bar{x})^3}{s^3} $$
Taking $13$ overlapping windows of roughly $19$ days each, I find that
tf = []
for J in 1:5
m = 1
for I in 20:19:250
append!(tf, skewness(tnf[m:I, J]))
m = I
end
end
tf = reshape(tf, 13, 5)
tf = DataFrame(Y05 = tf[:, 1], Y07 = tf[:, 2], Y10 = tf[:, 3], Y20 = tf[:, 4], Y30 = tf[:, 5])
| Row | Y05 | Y07 | Y10 | Y20 | Y30 |
|---|---|---|---|---|---|
| Any | Any | Any | Any | Any | |
| 1 | -0.568 | -0.653496 | -0.747529 | -1.02334 | -0.973438 |
| 2 | 0.784567 | 0.703553 | 0.516586 | 0.495628 | 0.349277 |
| 3 | -0.219969 | -0.175705 | -0.133237 | -0.202202 | -0.15677 |
| 4 | 0.256072 | -0.0269077 | -0.0572981 | 0.0296169 | 0.279735 |
| 5 | -0.308545 | -0.148538 | -0.0563201 | 0.276887 | 0.253119 |
| 6 | 0.0796602 | 0.461888 | 0.633026 | 0.801397 | 0.574063 |
| 7 | -0.481157 | -0.446476 | -0.367966 | -0.182191 | -0.0501671 |
| 8 | 0.779782 | 1.24712 | 1.43727 | 1.38136 | 1.42924 |
| 9 | 0.0200284 | -0.0435858 | -0.313658 | -0.258803 | -0.612505 |
| 10 | -0.131124 | -0.074698 | -0.13804 | -0.103081 | -0.130645 |
| 11 | -0.117827 | -0.312393 | -0.540316 | -0.0805298 | -0.113703 |
| 12 | -0.108004 | -0.0985711 | -0.349394 | -0.521659 | -0.679189 |
| 13 | -0.862436 | -0.425048 | -0.0898281 | -0.497283 | -0.327017 |
p = [plot(tf[:, I]) for I in 1:5];
gr();
plot(p[1], p[2], p[3], p[4], label = ["5Y" "7Y" "10Y" "20Y"], layout = (2, 2), legend = :bottomright)
plot(p[5], label = "30Y", legend = :bottomright)
which reveals changing skewness throughout the annual business cycle for $2021$, and excess kurtosis
$$ K_E = \frac{1}{n}\sum_{i = 1}^n \frac{(x_i - \bar{x})^4}{s^4} - 3 $$
tf = []
for J in 1:5
m = 1
for I in 20:19:250
append!(tf, kurtosis(tnf[m:I, J]))
m = I
end
end
tf = reshape(tf, 13, 5)
tf = DataFrame(Y05 = tf[:, 1], Y07 = tf[:, 2], Y10 = tf[:, 3], Y20 = tf[:, 4], Y30 = tf[:, 5])
| Row | Y05 | Y07 | Y10 | Y20 | Y30 |
|---|---|---|---|---|---|
| Any | Any | Any | Any | Any | |
| 1 | 0.332514 | 0.303221 | 0.248556 | 0.919989 | 1.10748 |
| 2 | -0.528393 | -0.717789 | -0.975277 | -0.868903 | -1.00005 |
| 3 | -1.07537 | -1.21409 | -1.35263 | -1.5394 | -1.6178 |
| 4 | -1.12993 | -1.34246 | -1.23972 | -1.03042 | -0.735339 |
| 5 | -1.08331 | -1.30471 | -0.977853 | -1.18482 | -1.15341 |
| 6 | -1.41138 | -1.01543 | -0.913586 | -0.551983 | -0.566703 |
| 7 | -0.527349 | -0.734567 | -0.628594 | -0.645988 | -1.0153 |
| 8 | -0.22559 | 0.44478 | 0.791461 | 0.887368 | 0.91249 |
| 9 | -0.776073 | -0.996903 | -0.854604 | -0.307561 | -0.630023 |
| 10 | -1.26093 | -1.36557 | -1.32386 | -1.29593 | -1.23222 |
| 11 | -1.52334 | -0.64524 | -1.06372 | -0.850724 | -0.766125 |
| 12 | -0.678207 | -0.549788 | -0.599095 | -0.207324 | 0.0803893 |
| 13 | -0.144354 | -1.21566 | -1.10028 | -0.837983 | -0.650208 |
p = [plot(tf[:, I]) for I in 1:5];
gr();
plot(p[1], p[2], p[3], p[4], label = ["5Y" "7Y" "10Y" "20Y"], layout = (2, 2), legend = :bottomright)
plot(p[5], label = "30Y", layout = (1, 1), legend = :bottomright)
Distribution Determination (Density Testing and PDF Kernel Selection)¶
To determine the density of the data in columns 10:14 (representing rates for the 5, 7, 10, 20 and 30 year Treasury securities), I will run a series of tests to select an adequate probability density function (PDF).
Nominal ADF Test for Stationarity (Augmented Dickey-Fuller Test)¶
Nominal CMT yield curve rates are not adjusted for inflation, and the aim here is to determine that inflation. Stabilized inflation rates are unlikely in a market economy, where real laws of supply and demand operate and equilibrium does not simply happen through an "invisible hand" unless pricing mechanics are in place to allow cooperation without coercion (F. von Hayek; Milton Friedman).
It is well worthwhile to determine whether the stochastic process behind each CMT yield curve rate series is stationary, and to remove any nonstationarity for the purposes of making better forecasts.
using HypothesisTests
[ADFTest(tux[:, I], Symbol("constant"), 5) for I in 10:14]
5-element Vector{ADFTest}:
Augmented Dickey-Fuller unit root test
--------------------------------------
Population details:
parameter of interest: coefficient on lagged non-differenced variable
value under h_0: 0
point estimate: -0.0102952
Test summary:
outcome with 95% confidence: fail to reject h_0
p-value: 0.7812
Details:
sample size in regression: 245
number of lags: 5
ADF statistic: -0.920298
Critical values at 1%, 5%, and 10%: [-3.45667 -2.87312 -2.57294]
Augmented Dickey-Fuller unit root test
--------------------------------------
Population details:
parameter of interest: coefficient on lagged non-differenced variable
value under h_0: 0
point estimate: -0.0242851
Test summary:
outcome with 95% confidence: fail to reject h_0
p-value: 0.3960
Details:
sample size in regression: 245
number of lags: 5
ADF statistic: -1.76883
Critical values at 1%, 5%, and 10%: [-3.45667 -2.87312 -2.57294]
Augmented Dickey-Fuller unit root test
--------------------------------------
Population details:
parameter of interest: coefficient on lagged non-differenced variable
value under h_0: 0
point estimate: -0.0325764
Test summary:
outcome with 95% confidence: fail to reject h_0
p-value: 0.2595
Details:
sample size in regression: 245
number of lags: 5
ADF statistic: -2.0634
Critical values at 1%, 5%, and 10%: [-3.45667 -2.87312 -2.57294]
Augmented Dickey-Fuller unit root test
--------------------------------------
Population details:
parameter of interest: coefficient on lagged non-differenced variable
value under h_0: 0
point estimate: -0.0319665
Test summary:
outcome with 95% confidence: fail to reject h_0
p-value: 0.2770
Details:
sample size in regression: 245
number of lags: 5
ADF statistic: -2.02199
Critical values at 1%, 5%, and 10%: [-3.45667 -2.87312 -2.57294]
Augmented Dickey-Fuller unit root test
--------------------------------------
Population details:
parameter of interest: coefficient on lagged non-differenced variable
value under h_0: 0
point estimate: -0.0219538
Test summary:
outcome with 95% confidence: fail to reject h_0
p-value: 0.5576
Details:
sample size in regression: 245
number of lags: 5
ADF statistic: -1.45091
Critical values at 1%, 5%, and 10%: [-3.45667 -2.87312 -2.57294]
In the ADF test for nominal interest rates I fail to reject $H_0$ (that the series contains a unit root, i.e., is nonstationary) at the $5\%$ significance level for every maturity: the p-values range from $0.2595$ to $0.7812$, all well above $0.05$, so there is no significant evidence against the unit root. I will now produce a comparable test for the real interest rates of fiscal year 2021.
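Because reading p-values off the printed output is error-prone, here is a minimal sketch of collecting them programmatically with HypothesisTests.pvalue; nominal_tests is a hypothetical binding of the comprehension above, which was not assigned a name in the original cell.

# Hypothetical: bind the five nominal ADF tests and extract their p-values
nominal_tests = [ADFTest(tux[:, I], :constant, 5) for I in 10:14]
round.(pvalue.(nominal_tests); digits = 4)  # expect ≈ [0.7812, 0.396, 0.2595, 0.277, 0.5576]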
Real ADF Test for Stationarity (Augmented Dickey-Fuller Test)¶
The real yield curve rates are adjusted for inflation. Running the same ADF specification on them, I then have
# Same ADF specification, applied to the real (inflation-adjusted) series
[ADFTest(tuy[:, I], :constant, 5) for I in 10:14]
5-element Vector{ADFTest}:
Augmented Dickey-Fuller unit root test
--------------------------------------
Population details:
parameter of interest: coefficient on lagged non-differenced variable
value under h_0: 0
point estimate: -0.0163393
Test summary:
outcome with 95% confidence: fail to reject h_0
p-value: 0.4155
Details:
sample size in regression: 244
number of lags: 5
ADF statistic: -1.73037
Critical values at 1%, 5%, and 10%: [-3.45678 -2.87317 -2.57297]
Augmented Dickey-Fuller unit root test
--------------------------------------
Population details:
parameter of interest: coefficient on lagged non-differenced variable
value under h_0: 0
point estimate: -0.0140648
Test summary:
outcome with 95% confidence: fail to reject h_0
p-value: 0.4187
Details:
sample size in regression: 244
number of lags: 5
ADF statistic: -1.72426
Critical values at 1%, 5%, and 10%: [-3.45678 -2.87317 -2.57297]
Augmented Dickey-Fuller unit root test
--------------------------------------
Population details:
parameter of interest: coefficient on lagged non-differenced variable
value under h_0: 0
point estimate: -0.0121503
Test summary:
outcome with 95% confidence: fail to reject h_0
p-value: 0.4720
Details:
sample size in regression: 244
number of lags: 5
ADF statistic: -1.62137
Critical values at 1%, 5%, and 10%: [-3.45678 -2.87317 -2.57297]
Augmented Dickey-Fuller unit root test
--------------------------------------
Population details:
parameter of interest: coefficient on lagged non-differenced variable
value under h_0: 0
point estimate: -0.0123815
Test summary:
outcome with 95% confidence: fail to reject h_0
p-value: 0.5118
Details:
sample size in regression: 244
number of lags: 5
ADF statistic: -1.54371
Critical values at 1%, 5%, and 10%: [-3.45678 -2.87317 -2.57297]
Augmented Dickey-Fuller unit root test
--------------------------------------
Population details:
parameter of interest: coefficient on lagged non-differenced variable
value under h_0: 0
point estimate: -0.0116842
Test summary:
outcome with 95% confidence: fail to reject h_0
p-value: 0.5705
Details:
sample size in regression: 244
number of lags: 5
ADF statistic: -1.42442
Critical values at 1%, 5%, and 10%: [-3.45678 -2.87317 -2.57297]
In the case of the real yield curve rates, I again fail to reject $H_0$ at the $5\%$ level. Since both the real and the nominal series fail to reject the unit-root null, I will apply a first-order differencing transformation to both types of rates to remove the nonstationarity. In particular I need the transformation for the inflation rates, since the aim of my study is to determine the average financial leverage of these securities.
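For concreteness, first-order differencing replaces each observation with its change from the previous one,

$$X_t = y_t - y_{t-1},$$

which shortens each series by one row; the $249 \times 5$ matrix below therefore comes from $250$ rows in tnf.csv.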
# Load the nominal five-maturity table and take first differences along the rows
tnf = CSV.read("tnf.csv", DataFrame)
X = diff(Matrix(tnf), dims = 1)
249×5 Matrix{Float64}:
0.14 0.12 0.11 0.09 0.09
0.02 0.05 0.05 0.07 0.08
0.02 0.03 0.03 0.03 0.03
0.03 0.04 0.06 0.06 0.05
0.04 0.04 0.02 -0.01 -2.22045e-16
-0.02 -0.02 -0.02 -0.03 -0.04
0.03 0.02 2.22045e-16 -0.02 -0.04
-0.02 -1.11022e-16 0.03 0.04 0.04
-0.04 -0.06 -0.05 -0.04 -0.04
-0.01 0.0 -0.01 0.01 0.0
-0.04 -0.05 -0.05 -0.04 -0.01
-0.01 0.01 0.02 0.03 0.03
0.0 -0.01 -0.01 -0.01 -0.01
⋮
0.09 0.08 0.08 0.06 0.04
-0.01 -0.01 -0.01 0.01 0.02
-0.05 -0.04 -0.01 0.01 0.01
-0.01 -0.03 -0.05 -0.06 -0.07
0.0 0.01 0.03 0.05 0.05
0.06 0.05 0.05 0.01 0.04
-0.03 -0.03 -0.05 -0.05 -0.06
0.05 0.06 0.07 0.07 0.07
0.0 -0.02 -0.03 -0.03 -0.04
0.03 0.02 0.02 0.02 0.02
0.04 0.06 0.06 0.04 0.05
-0.01 -0.02 -0.03 -0.04 -0.06
Running ADF Test Again to Determine Stationarity¶
# Same ADF specification, now on each first-differenced series
[ADFTest(X[:, C], :constant, 5) for C in 1:5]
5-element Vector{ADFTest}:
Augmented Dickey-Fuller unit root test
--------------------------------------
Population details:
parameter of interest: coefficient on lagged non-differenced variable
value under h_0: 0
point estimate: -1.27451
Test summary:
outcome with 95% confidence: reject h_0
p-value: <1e-09
Details:
sample size in regression: 243
number of lags: 5
ADF statistic: -7.27819
Critical values at 1%, 5%, and 10%: [-3.45689 -2.87322 -2.57299]
Augmented Dickey-Fuller unit root test
--------------------------------------
Population details:
parameter of interest: coefficient on lagged non-differenced variable
value under h_0: 0
point estimate: -1.16326
Test summary:
outcome with 95% confidence: reject h_0
p-value: <1e-09
Details:
sample size in regression: 243
number of lags: 5
ADF statistic: -7.05671
Critical values at 1%, 5%, and 10%: [-3.45689 -2.87322 -2.57299]
Augmented Dickey-Fuller unit root test
--------------------------------------
Population details:
parameter of interest: coefficient on lagged non-differenced variable
value under h_0: 0
point estimate: -1.17331
Test summary:
outcome with 95% confidence: reject h_0
p-value: <1e-09
Details:
sample size in regression: 243
number of lags: 5
ADF statistic: -7.10597
Critical values at 1%, 5%, and 10%: [-3.45689 -2.87322 -2.57299]
Augmented Dickey-Fuller unit root test
--------------------------------------
Population details:
parameter of interest: coefficient on lagged non-differenced variable
value under h_0: 0
point estimate: -1.17759
Test summary:
outcome with 95% confidence: reject h_0
p-value: <1e-09
Details:
sample size in regression: 243
number of lags: 5
ADF statistic: -7.04052
Critical values at 1%, 5%, and 10%: [-3.45689 -2.87322 -2.57299]
Augmented Dickey-Fuller unit root test
--------------------------------------
Population details:
parameter of interest: coefficient on lagged non-differenced variable
value under h_0: 0
point estimate: -1.21309
Test summary:
outcome with 95% confidence: reject h_0
p-value: <1e-10
Details:
sample size in regression: 243
number of lags: 5
ADF statistic: -7.57134
Critical values at 1%, 5%, and 10%: [-3.45689 -2.87322 -2.57299]
With all p-values below $10^{-9}$, every differenced series rejects the unit-root null at the $5\%$ significance level, so the differenced data can be treated as stationary.
Test for Normality (Jarque-Bera Test)¶
Next, I want to determine whether the differenced series are normally distributed. The Jarque-Bera test is a Lagrange multiplier test of normality based on the higher-order moments: it checks whether the sample skewness and kurtosis match those of a normal distribution.
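The test statistic combines the two sample moments,

$$JB = \frac{n}{6}\left(S^2 + \frac{(K - 3)^2}{4}\right),$$

where $n$ is the number of observations, $S$ the sample skewness, and $K$ the sample kurtosis; under $H_0$ ($S = 0$ and $K = 3$) the statistic is asymptotically $\chi^2$-distributed with two degrees of freedom. As a check against the first result below: $\frac{249}{6}\left(0.3332^2 + \frac{(3.578 - 3)^2}{4}\right) \approx 8.07$, matching the reported JB statistic.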
[JarqueBeraTest(X[:, C]) for C in 1:5]
5-element Vector{JarqueBeraTest}:
Jarque-Bera normality test
--------------------------
Population details:
parameter of interest: skewness and kurtosis
value under h_0: "0 and 3"
point estimate: "0.33317455747003505 and 3.57795797777328"
Test summary:
outcome with 95% confidence: reject h_0
one-sided p-value: 0.0177
Details:
number of observations: 249
JB statistic: 8.07234
Jarque-Bera normality test
--------------------------
Population details:
parameter of interest: skewness and kurtosis
value under h_0: "0 and 3"
point estimate: "0.13494434481530984 and 3.2697311102496625"
Test summary:
outcome with 95% confidence: fail to reject h_0
one-sided p-value: 0.4699
Details:
number of observations: 249
JB statistic: 1.51055
Jarque-Bera normality test
--------------------------
Population details:
parameter of interest: skewness and kurtosis
value under h_0: "0 and 3"
point estimate: "-0.07758034420872718 and 3.261679559022148"
Test summary:
outcome with 95% confidence: fail to reject h_0
one-sided p-value: 0.6187
Details:
number of observations: 249
JB statistic: 0.960217
Jarque-Bera normality test
--------------------------
Population details:
parameter of interest: skewness and kurtosis
value under h_0: "0 and 3"
point estimate: "-0.0052506316707056945 and 3.348957834784301"
Test summary:
outcome with 95% confidence: fail to reject h_0
one-sided p-value: 0.5314
Details:
number of observations: 249
JB statistic: 1.26452
Jarque-Bera normality test
--------------------------
Population details:
parameter of interest: skewness and kurtosis
value under h_0: "0 and 3"
point estimate: "-0.12796988258018926 and 3.261161219070522"
Test summary:
outcome with 95% confidence: fail to reject h_0
one-sided p-value: 0.4998
Details:
number of observations: 249
JB statistic: 1.38724
Except for the five-year note, which rejects normality (one-sided $p \approx 0.018$), the J-B test fails to reject the normality hypothesis for the other constant maturity Treasury coupon securities: their differenced series are consistent with a normal distribution.
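As a quick visual follow-up (not part of the original test battery), Q-Q plots against a fitted normal distribution from StatsPlots make the contrast visible; this sketch assumes the differenced matrix X from above is still in scope.

# Illustrative Q-Q plots: the 5Y differences (which reject normality) next to
# the 10Y differences (which do not).
using StatsPlots, Distributions
plot(qqplot(Normal, X[:, 1], title = "5Y"),
     qqplot(Normal, X[:, 3], title = "10Y"),
     layout = (1, 2), legend = false)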
Conclusions¶
No single top-down probability density function (PDF) could be settled on, given the volatility visible in the time series plots; a data-driven modeling approach is needed to produce better forecasts.
The boxplots revealed some price anomalies in the Treasury securities. A better and more precise time series model will require further analysis, although what is here suffices for an exploratory pass. A good question to ask is, "Is a lag of $5$ sufficient, or is it necessary to try a lower lag?" (see the sketch below).
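One way to answer the lag question is to re-run the ADF test over a range of lags and compare p-values; a minimal sketch, assuming the differenced matrix X from above:

# Lag sensitivity on the differenced 5Y series: if the unit-root null is
# rejected at every lag from 1 to 5, the conclusion does not hinge on lag 5.
using HypothesisTests
[(lag = L, p = pvalue(ADFTest(X[:, 1], :constant, L))) for L in 1:5]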
The inflation rates for fiscal year $2021$ indicate that there is inflation in the US securities market (more than in 2019, in a back-to-back comparison). In the following notebook, titled Treas TS Analysis, I will investigate ARIMA models for $T \in \{5, 7, 10, 20, 30\}$, taking a closer look at my data-driven modeling approach.